A segment-based non-parametric approach for monophone recognition
Authors
Abstract
In this paper, we propose a segment-based non-parametric method for monophone recognition. We pre-segment the speech utterance into its underlying phonemes using a group-delay-based algorithm. Then, we apply the kNN/SASH phoneme classification technique to classify the hypothesized phonemes. Since the phoneme boundaries are already known during decoding, the search space is very limited and recognition is fast. However, such hard segmentation decisions lead to missed boundaries and over-segmentation. Therefore, while constructing the graph for an utterance, we use phoneme duration constraints and broad-class similarity information to merge or split segments and create new branches. We perform a simplified acoustic-level monophone recognition task on the TIMIT test database. Since phoneme transition probabilities are not included, only one (most likely) hypothesis and its score are provided for each segment, and a simple shortest-path search algorithm, rather than the Viterbi search, is applied to find the best phoneme sequence. This simplified evaluation achieves 58.5% accuracy and 67.8% correctness.
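The decoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the utterance graph is a directed acyclic graph whose nodes are segment boundaries, that each edge carries the single best phoneme label for that segment, and that each edge has a non-negative cost (for example, a negative log classifier score). The function name, the edge format, and the example costs are all hypothetical.

```python
from collections import defaultdict

def best_phoneme_sequence(n_boundaries, edges):
    """Shortest path from boundary 0 to the last boundary.

    edges: iterable of (start, end, label, cost) with start < end.
    Because every edge points forward in time, visiting the boundaries
    in increasing order is a valid topological order.
    """
    inf = float("inf")
    best = [inf] * n_boundaries      # cheapest known cost to reach each boundary
    back = [None] * n_boundaries     # (previous boundary, label) on the best path
    best[0] = 0.0

    outgoing = defaultdict(list)
    for start, end, label, cost in edges:
        outgoing[start].append((end, label, cost))

    for node in range(n_boundaries):
        if best[node] == inf:
            continue                 # boundary not reachable from the start
        for end, label, cost in outgoing[node]:
            if best[node] + cost < best[end]:
                best[end] = best[node] + cost
                back[end] = (node, label)

    if best[-1] == inf:
        return None, inf             # no complete path through the utterance

    labels = []
    node = n_boundaries - 1
    while node != 0:
        node, label = back[node]
        labels.append(label)
    labels.reverse()
    return labels, best[-1]

# Hypothetical utterance with four boundaries. The edge (1, 3) is a "merge"
# branch that treats two adjacent segments as one over-segmented phoneme.
EXAMPLE_EDGES = [
    (0, 1, "s", 1.0),
    (1, 2, "ih", 2.5),
    (2, 3, "ih", 2.0),
    (1, 3, "iy", 3.0),
]

result = best_phoneme_sequence(4, EXAMPLE_EDGES)  # -> (['s', 'iy'], 4.0)
```

In this toy graph the merged branch wins because its total cost (1.0 + 3.0) is lower than the cost of keeping the two short segments (1.0 + 2.5 + 2.0). With only one hypothesis per segment and no transition probabilities, a single forward pass of this kind is sufficient.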
Similar resources
Enhancing Vocal Tract Length Normalization with Elastic Registration for Automatic Speech Recognition
Vocal tract length normalization (VTLN) is commonly applied utterance-wise with a warping function that makes the assumption of a linear dependence between the vocal tract length and the location of the formants. In this work we propose a data-driven method for enhancing the performance of systems that already use standard VTLN. The method is based on elastic registration to estimate optimal non...
A comparison of parametric and non-parametric methods of standardized precipitation index (SPI) in drought monitoring (Case study: Gorganroud basin)
The Standardized Precipitation Index (SPI) is the most common index for drought monitoring. Although the calculation of this index is usually done by using the gamma distribution fitting of precipitation data, studies have shown that for accurate monitoring of drought, the optimal distribution of precipitation in each month should be determined. On the other hand, in non-stationary time series,...
Conversion from phoneme based to grapheme based acoustic models for speech recognition
This paper focuses on acoustic modeling in speech recognition. A novel approach to building grapheme based acoustic models by conversion from existing phoneme based acoustic models is proposed. The grapheme based acoustic models are created as a weighted sum of monophone acoustic models. The influence of a particular monophone is determined with the phoneme to grapheme confusion matrix. Furthe...
A new non-parametric approach for suppliers selection
In this paper we propose a simple non-parametric model for the multiple criteria supplier selection problem. The proposed model does not generate a zero weight for a certain criterion and ranks the suppliers without solving the model n times (one linear programming (LP) for each supplier) and therefore allows the manager to get faster results. The methodology is illustrated using an example.
An STD System for OOV Query Terms Integrating Multiple STD Results of Various Subword units
We have been proposing a Spoken Term Detection (STD) method for Out-Of-Vocabulary (OOV) query terms integrating various subword recognition results using monophone, triphone, demiphone, one third phone, and Sub-phonetic segment (SPS) models. In the proposed method, subword-based ASR (Automatic Speech Recognition) is performed for all spoken documents and subword recognition results are generate...
Publication date: 2010